AITopics | data owner

Collaborating Authors

data owner

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

BridgePure: Limited Protection Leakage Can Break Black-Box Data Protection

Neural Information Processing SystemsJun-15-2026, 19:42:14 GMT

Availability attacks, or unlearnable examples, are defensive techniques that allow data owners to modify their datasets in ways that prevent unauthorized machine learning models from learning effectively while maintaining the data's intended functionality. It has led to the release of popular black-box tools (e.g., APIs) for users to upload personal data and receive protected counterparts. In this work, we show that such black-box protections can be substantially compromised if a small set of unprotected in-distribution data is available. Specifically, we propose a novel threat model of protection leakage, where an adversary can (1) easily acquire (unprotected, protected) pairs by querying the blackbox protections with a small unprotected dataset; and (2) train a diffusion bridge model to build a mapping between unprotected and protected data. This mapping, termed BridgePure, can effectively remove the protection from any previously unseen data within the same distribution. BridgePure demonstrates superior purification performance on classification and style mimicry tasks, exposing critical vulnerabilities in black-box data protection. We suggest that practitioners implement multi-level countermeasures to mitigate such risks.

artificial intelligence, machine learning, protection, (17 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Can Graph Neural Networks Expose Training Data Properties? An Efficient Risk Assessment Approach

Neural Information Processing SystemsMar-21-2026, 07:58:58 GMT

Graph neural networks (GNNs) have attracted considerable attention due to their diverse applications.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

d346d91999074dd8d6073d4c3b13733b-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 03:01:45 GMT

dataset, model trainer, noise, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)

Industry: Information Technology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Subgraph Federated Learning with Missing Neighbor Generation

Neural Information Processing SystemsFeb-8-2026, 04:56:27 GMT

Graphs have been widely used in data mining and machine learning due to their unique representation of real-world objects and their interactions.

data mining, machine learning, subgraph, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.04)
North America > Canada > British Columbia (0.04)
Asia > China > Hong Kong (0.04)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

How to Securely Shuffle? A survey about Secure Shufflers for privacy-preserving computations

Damie, Marc, Hahn, Florian, Peter, Andreas, Ramon, Jan

arXiv.org Artificial IntelligenceDec-2-2025

Ishai et al. (FOCS'06) introduced secure shuffling as an efficient building block for private data aggregation. Recently, the field of differential privacy has revived interest in secure shufflers by highlighting the privacy amplification they can provide in various computations. Although several works argue for the utility of secure shufflers, they often treat them as black boxes; overlooking the practical vulnerabilities and performance trade-offs of existing implementations. This leaves a central question open: what makes a good secure shuffler? This survey addresses that question by identifying, categorizing, and comparing 26 secure protocols that realize the necessary shuffling functionality. To enable a meaningful comparison, we adapt and unify existing security definitions into a consistent set of properties. We also present an overview of privacy-preserving technologies that rely on secure shufflers, offer practical guidelines for selecting appropriate protocols, and outline promising directions for future work.

data mining, machine learning, shuffler, (18 more...)

arXiv.org Artificial Intelligence

2507.01487

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

How do data owners say no? A case study of data consent mechanisms in web-scraped vision-language AI training datasets

Lee, Chung Peng, Hong, Rachel, Jiang, Harry H., Plotnik, Aster, Agnew, William, Morgenstern, Jamie

arXiv.org Artificial IntelligenceNov-25-2025

The internet has become the main source of data to train modern text-to-image or vision-language models, yet it is increasingly unclear whether web-scale data collection practices for training AI systems adequately respect data owners' wishes. Ignoring the owner's indication of consent around data usage not only raises ethical concerns but also has recently been elevated into lawsuits around copyright infringement cases. In this work, we aim to reveal information about data owners' consent to AI scraping and training, and study how it's expressed in DataComp, a popular dataset of 12.8 billion text-image pairs. We examine both the sample-level information, including the copyright notice, watermarking, and metadata, and the web-domain-level information, such as a site's Terms of Service (ToS) and Robots Exclusion Protocol. We estimate at least 122M of samples exhibit some indication of copyright notice in CommonPool, and find that 60\% of the samples in the top 50 domains come from websites with ToS that prohibit scraping. Furthermore, we estimate 9-13\% with 95\% confidence interval of samples from CommonPool to contain watermarks, where existing watermark detection methods fail to capture them in high fidelity. Our holistic methods and findings show that data owners rely on various channels to convey data consent, of which current AI data collection pipelines do not entirely respect. These findings highlight the limitations of the current dataset curation/release practice and the need for a unified data consent framework taking AI purposes into consideration.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.08637

Country: North America > United States (0.47)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Information Technology (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Secure Sparse Matrix Multiplications and their Applications to Privacy-Preserving Machine Learning

Damie, Marc, Hahn, Florian, Peter, Andreas, Ramon, Jan

arXiv.org Artificial IntelligenceOct-17-2025

To preserve privacy, multi-party computation (MPC) enables executing Machine Learning (ML) algorithms on secret-shared or encrypted data. However, existing MPC frameworks are not optimized for sparse data. This makes them unsuitable for ML applications involving sparse data, e.g., recommender systems or genomics. Even in plaintext, such applications involve high-dimensional sparse data, that cannot be processed without sparsity-related optimizations due to prohibitively large memory requirements. Since matrix multiplication is central in ML algorithms, we propose MPC algorithms to multiply secret sparse matrices. On the one hand, our algorithms avoid the memory issues of the "dense" data representation of classic secure matrix multiplication algorithms. On the other hand, our algorithms can significantly reduce communication costs (some experiments show a factor 1000) for realistic problem sizes. We validate our algorithms in two ML applications in which existing protocols are impractical. An important question when developing MPC algorithms is what assumptions can be made. In our case, if the number of non-zeros in a row is a sensitive piece of information then a short runtime may reveal that the number of non-zeros is small. Existing approaches make relatively simple assumptions, e.g., that there is a universal upper bound to the number of non-zeros in a row. This often doesn't align with statistical reality, in a lot of sparse datasets the amount of data per instance satisfies a power law. We propose an approach which allows adopting a safe upper bound on the distribution of non-zeros in rows/columns of sparse matrices.

data mining, machine learning, multiplication, (19 more...)

arXiv.org Artificial Intelligence

2510.14894

Country: Europe (1.00)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

A Gradient analysis

Neural Information Processing SystemsOct-9-2025, 16:40:43 GMT

To better understand why our generated confounder noise can make the data unlearnable, we can also gain some insights according to optimization gradient. Empirically, if one image provides a large gradient in a backpropagation, this image has a lot of learnable knowledge, and vice versa. Figure 9 shows the accuracy curves of our method during the training epoch. Then we give a detailed discussion about this setting. To better understand this adaptive setting, we first illustrate the assumption on the data owner's The model trainer wishes to train a denoiser against the noise generated by the ConfounderGAN.

artificial intelligence, dataset, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)

Industry: Information Technology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Filters

Collaborating Authors

data owner

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

BridgePure: Limited Protection Leakage Can Break Black-Box Data Protection

34adeb8e3242824038aa65460a47c29e-Paper.pdf

Can Graph Neural Networks Expose Training Data Properties? An Efficient Risk Assessment Approach

9d28de8ff9bb6a3fa41fddfdc28f3bc1-AuthorFeedback.pdf

d346d91999074dd8d6073d4c3b13733b-Supplemental-Conference.pdf

Subgraph Federated Learning with Missing Neighbor Generation

How to Securely Shuffle? A survey about Secure Shufflers for privacy-preserving computations

How do data owners say no? A case study of data consent mechanisms in web-scraped vision-language AI training datasets

Secure Sparse Matrix Multiplications and their Applications to Privacy-Preserving Machine Learning

A Gradient analysis